Automatic speechreading of impaired speech

Authors

  • Gerasimos Potamianos
  • Chalapathy Neti
Abstract

We investigate the use of visual, mouth-region information in improving automatic speech recognition (ASR) of the speech impaired. Given the video of an utterance by such a subject, we first extract appearance-based visual features from the mouth region-of-interest, and we use a feature fusion method to combine them with the subject’s audio features into bimodal observations. Subsequently, we adapt the parameters of a speaker-independent, audio-visual hidden Markov model, trained on a large database of hearing subjects, to the audio-visual features extracted from the speech impaired videos. We consider a number of speaker adaptation techniques, and we study their performance in the case of a single speech impaired subject uttering continuous read speech, as well as connected digits. For both tasks, maximum-a-posteriori adaptation followed by maximum likelihood linear regression performs the best, achieving a word error rate relative reduction of 61% and 96%, respectively, over unadapted audio-visual ASR, and a 13% and 58% relative reduction over audio-only speaker-adapted ASR. In addition, we compare audio-only and audio-visual speaker-adapted ASR of the single speech impaired subject to ASR of subjects with normal speech, over a wide range of audio channel signal-to-noise ratios. Interestingly, for the small-vocabulary connected digits task, audio-visual ASR performance is almost identical across the two populations.
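The pipeline summarized in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not confirm: feature fusion is shown as plain per-frame concatenation (the abstract only says "a feature fusion method"), the maximum-a-posteriori step is shown as the textbook mean update for a single Gaussian with a hypothetical prior weight `tau`, and the maximum likelihood linear regression step is omitted. The function names are illustrative and do not come from the paper.

```python
from typing import List, Sequence


def fuse_features(audio: Sequence[float], visual: Sequence[float]) -> List[float]:
    """Concatenative fusion (an assumption): join the audio and visual
    feature vectors of one frame into a single bimodal observation."""
    return list(audio) + list(visual)


def map_adapt_mean(
    prior_mean: Sequence[float],
    frames: Sequence[Sequence[float]],
    posteriors: Sequence[float],
    tau: float = 10.0,
) -> List[float]:
    """Textbook MAP update of one Gaussian mean.

    prior_mean : mean from the speaker-independent model
    frames     : adaptation observations (e.g. fused audio-visual vectors)
    posteriors : occupation probability of this Gaussian for each frame
    tau        : hypothetical prior weight; larger values trust the prior more
    """
    dim = len(prior_mean)
    occupancy = sum(posteriors)
    weighted_sum = [0.0] * dim
    for frame, gamma in zip(frames, posteriors):
        for d in range(dim):
            weighted_sum[d] += gamma * frame[d]
    # Interpolate between the prior mean and the adaptation-data statistics.
    return [
        (tau * prior_mean[d] + weighted_sum[d]) / (tau + occupancy)
        for d in range(dim)
    ]


def relative_wer_reduction(baseline_wer: float, adapted_wer: float) -> float:
    """Relative word error rate reduction, in percent."""
    return 100.0 * (baseline_wer - adapted_wer) / baseline_wer
```

With little adaptation data the occupancy is small and the adapted mean stays close to the speaker-independent prior; with more data it moves toward the subject's own statistics. The relative reductions quoted in the abstract are of the form computed by `relative_wer_reduction`, that is, the drop in word error rate expressed as a percentage of the baseline rate.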

Related resources

Tactual Cued Speech as a Supplement to Speechreading

The Cued Speech method devised by Cornett (1967) has proven to be a highly effective means of supplementing the information available through speechreading. For example, highly trained deaf receivers of Cued Speech are able to achieve nearly perfect reception of cued conversational sentences (e.g., Nicholls & Ling, 1982; Uchanski et al., 1992). The success of this method, combined with recent a...

A speechreading aid based on phonetic ASR

Manual Cued Speech (MCS) is an effective method of communication by the deaf and hearing-impaired. We first describe our work on assessing the feasibility of automatic determination and presentation of cues without intervention by the speaker. The conclusions of this study are then applied to the design and implementation of a prototype automatic cueing system using HMM-based automatic speech r...

Automatic speech recognition to aid the hearing impaired: prospects for the automatic generation of cued speech.

Although great strides have been made in the development of automatic speech recognition (ASR) systems, the communication performance achievable with the output of current real-time speech recognition systems would be extremely poor relative to normal speech reception. An alternate application of ASR technology to aid the hearing impaired would derive cues from the acoustical speech signal that...

Exploiting lower face symmetry in appearance-based automatic speechreading

Appearance-based visual speech feature extraction is being widely used in the automatic speechreading and audio-visual speech recognition literature. In its most common application, the discrete cosine transform (DCT) is utilized to compress the image of the speaker’s mouth region-of-interest (ROI), and the highest energy spatial frequency components are retained as visual features. Good genera...

Beyond lips: Components of speechreading skill

The purpose of the present thesis was threefold. First, to study perceptual and cognitive correlates to individual differences in speechreading performance. Second, to examine certain aspects of word-decoding/discrimination in the speechreading process. Third, to investigate whether hearing-impaired individuals compensate for their hearing loss by means of improved speechreading ability. The re...

Speechreading in Deaf Adults with Cochlear Implants: Evidence for Perceptual Compensation

Previous research has provided evidence for a speechreading advantage in congenitally deaf adults compared to hearing adults. A 'perceptual compensation' account of this finding proposes that prolonged early onset deafness leads to a greater reliance on visual, as opposed to auditory, information when perceiving speech which in turn results in superior visual speech perception skills in deaf ad...

Publication date: 2001